AITopics | symmetric kl divergence

Reviews: Adversarial Symmetric Variational Autoencoder

Neural Information Processing Systems, Oct-7-2024, 20:59:44 GMT

The paper proposes a variant of the Variational Auto-Encoder training objective. It uses adversarial training to minimize a symmetric KL divergence between the joint distributions of latent and observed variables, $p(z,x) = p(z)\,p_\theta(x \mid z)$ and $q(z,x) = q(x)\,q_\phi(z \mid x)$. The approach is similar to the recent [ Mescheder, Nowozin, Geiger. Adversarial variational bayes: Unifying variational autoencoders and generative adversarial networks, 2016 ] in joining VAE and GAN-like objectives, but it is original in that it minimizes a symmetric KL divergence (with a GAN-like objective), which appears crucial to achieving good-quality samples. It is also reminiscent of ALI [ Dumoulin et al.
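
For reference, the symmetric KL divergence between two joint distributions is commonly written as the sum of the two directed KL divergences. The display below is a sketch in the notation $p(z,x)$ and $q(z,x)$ of the review; the name $D_{\mathrm{sym}}$ is an assumption introduced here, and some authors include an extra factor of $1/2$.

$$
D_{\mathrm{sym}}(p, q) = \mathrm{KL}(p \,\|\, q) + \mathrm{KL}(q \,\|\, p)
= \mathbb{E}_{p(z,x)}\!\left[\log \frac{p(z,x)}{q(z,x)}\right] + \mathbb{E}_{q(z,x)}\!\left[\log \frac{q(z,x)}{p(z,x)}\right]
$$

Unlike either directed KL divergence alone, this quantity is unchanged when $p$ and $q$ are swapped.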

adversarial symmetric variational autoencoder, eq 10, objective, (7 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

AMAGOLD: Amortized Metropolis Adjustment for Efficient Stochastic Gradient MCMC

Zhang, Ruqi, Cooper, A. Feder, De Sa, Christopher

arXiv.org Machine Learning, Feb-29-2020

Stochastic gradient Hamiltonian Monte Carlo (SGHMC) is an efficient method for sampling from continuous distributions. It is a faster alternative to HMC: instead of using the whole dataset at each iteration, SGHMC uses only a subsample. This improves performance, but introduces bias that can cause SGHMC to converge to the wrong distribution. One can prevent this using a step size that decays to zero, but such a step size schedule can drastically slow down convergence. To address this tension, we propose a novel second-order SG-MCMC algorithm, AMAGOLD, that infrequently uses Metropolis-Hastings (M-H) corrections to remove bias. The infrequency of corrections amortizes their cost. We prove AMAGOLD converges to the target distribution with a fixed, rather than a diminishing, step size, and that its convergence rate is at most a constant factor slower than a full-batch baseline. We empirically demonstrate AMAGOLD's effectiveness on synthetic distributions, Bayesian logistic regression, and Bayesian neural networks.
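
To illustrate what a Metropolis-Hastings correction does, here is a minimal sketch of a plain random-walk Metropolis sampler. It is not AMAGOLD and not SGHMC: it uses no gradients, no subsampling, and applies the accept/reject test at every step rather than infrequently. All function names and the standard normal target are assumptions chosen for illustration.

```python
import math
import random


def acceptance_probability(log_target_current, log_target_proposed):
    """M-H acceptance probability for a symmetric proposal: min(1, p'/p)."""
    return math.exp(min(0.0, log_target_proposed - log_target_current))


def random_walk_metropolis(log_target, x0, num_samples, step_size, rng):
    """Sample a 1-D target using Gaussian random-walk proposals.

    The accept/reject test keeps the chain's stationary distribution equal
    to the target, even though step_size is fixed and never decays.
    """
    samples = []
    x = x0
    log_px = log_target(x)
    for _ in range(num_samples):
        proposal = x + rng.gauss(0.0, step_size)
        log_pp = log_target(proposal)
        if rng.random() < acceptance_probability(log_px, log_pp):
            x, log_px = proposal, log_pp  # accept the move
        samples.append(x)  # on rejection the current state is repeated
    return samples


def log_standard_normal(x):
    """Unnormalized log-density of N(0, 1); hypothetical example target."""
    return -0.5 * x * x
```

Calling `random_walk_metropolis(log_standard_normal, 0.0, 20000, 1.0, random.Random(0))` yields a chain whose empirical mean and variance should be close to 0 and 1. The point of the sketch is only that the correction step removes proposal-induced bias with a fixed step size.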

amagold, artificial intelligence, upstream oil & gas, (16 more...)

2003.00193

Country: Europe > Italy (0.14)

Genre: Research Report > New Finding (0.34)

Industry: Energy > Oil & Gas > Upstream (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.73)

Centroid estimation based on symmetric KL divergence for Multinomial text classification problem

Chen, Jiangning, Matzinger, Heinrich, Zhai, Haoyan, Zhou, Mi

arXiv.org Machine Learning, Oct-24-2018

We define a new method for estimating class centroids in text classification, based on the symmetric KL-divergence between the distribution of words in training documents and their class centroids. Experiments on several standard data sets indicate that the new method achieves substantial improvements over the traditional classifiers.
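
The abstract does not spell out the centroid estimator, so the following is only a sketch of the general idea of classifying a document by its symmetric KL divergence to class centroids. It assumes the simplest possible centroid (the smoothed word distribution of the pooled class tokens), which is not the paper's estimator; all function names and the add-one smoothing are assumptions made for illustration.

```python
import math
from collections import Counter


def word_distribution(tokens, vocab, smoothing=1.0):
    """Smoothed relative word frequencies over a fixed vocabulary."""
    counts = Counter(tokens)
    total = sum(counts[w] for w in vocab) + smoothing * len(vocab)
    return [(counts[w] + smoothing) / total for w in vocab]


def symmetric_kl(p, q):
    """KL(p || q) + KL(q || p) for strictly positive discrete distributions."""
    # The two directed sums combine into sum_i (p_i - q_i) * log(p_i / q_i).
    return sum((pi - qi) * math.log(pi / qi) for pi, qi in zip(p, q))


def fit_centroids(docs_by_class, vocab):
    """Assumed centroid: word distribution of all tokens pooled per class."""
    centroids = {}
    for label, docs in docs_by_class.items():
        pooled = [tok for doc in docs for tok in doc]
        centroids[label] = word_distribution(pooled, vocab)
    return centroids


def classify(tokens, centroids, vocab):
    """Return the label whose centroid is closest in symmetric KL."""
    p = word_distribution(tokens, vocab)
    return min(centroids, key=lambda label: symmetric_kl(p, centroids[label]))
```

Smoothing keeps every probability strictly positive, which is what makes both directed KL terms finite; without it, a word missing from either distribution would make the divergence infinite.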

classification, machine learning, natural language, (15 more...)

1808.10261

Country: North America > United States (0.15)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Classification (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.71)

Measuring the non-asymptotic convergence of sequential Monte Carlo samplers using probabilistic programming

Cusumano-Towner, Marco F., Mansinghka, Vikash K.

arXiv.org Machine Learning, May-6-2017

A key limitation of sampling algorithms for approximate inference is that it is difficult to quantify their approximation error. Widely used sampling schemes, such as sequential importance sampling with resampling and Metropolis-Hastings, produce output samples drawn from a distribution that may be far from the target posterior distribution. This paper shows how to upper-bound the symmetric KL divergence between the output distribution of a broad class of sequential Monte Carlo (SMC) samplers and their target posterior distributions, subject to assumptions about the accuracy of a separate gold-standard sampler. The proposed method applies to samplers that combine multiple particles, multinomial resampling, and rejuvenation kernels. The experiments show the technique being used to estimate bounds on the divergence of SMC samplers for posterior inference in a Bayesian linear regression model and a Dirichlet process mixture model. This paper builds on a growing body of work begun by [1] and [2] into estimating upper bounds on KL divergences between a sampler's output distribution and the posterior. In variational inference, the KL divergence of the variational approximation is the gap between the variational lower bound and the log-evidence.
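
The closing statement about variational inference corresponds to a standard identity, sketched here in notation that is an assumption of this note rather than the paper's: $x$ is the observed data, $z$ the latent variables, $q(z)$ the variational approximation, and $\mathcal{L}(q)$ the variational lower bound.

$$
\log p(x) = \underbrace{\mathbb{E}_{q(z)}\!\left[\log \frac{p(x, z)}{q(z)}\right]}_{\mathcal{L}(q)} + \mathrm{KL}\big(q(z) \,\|\, p(z \mid x)\big)
\quad\Longrightarrow\quad
\mathrm{KL}\big(q(z) \,\|\, p(z \mid x)\big) = \log p(x) - \mathcal{L}(q)
$$

Because the KL term is non-negative, $\mathcal{L}(q)$ never exceeds the log-evidence, and the gap between them is exactly the divergence of the approximation from the posterior.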

artificial intelligence, machine learning, sampler, (16 more...)

1612.02161

Country: North America > United States > Massachusetts (0.14)

Genre: Research Report (0.40)

Industry: Government > Military (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.54)

Smoothed Hierarchical Dirichlet Process: A Non-Parametric Approach to Constraint Measures

Luo, Cheng, Xiang, Yang, Da Xu, Richard Yi

arXiv.org Machine Learning, Apr-16-2016

Time-varying mixture densities occur in many scenarios. For example, the distributions of keywords that appear in publications may evolve from year to year, and video frame features associated with multiple targets may evolve over a sequence. Any model that realistically caters to this phenomenon must exhibit two important properties: the underlying mixture densities must have an unknown number of mixtures, and there must be some "smoothness" constraints in place for adjacent mixture densities. The traditional Hierarchical Dirichlet Process (HDP) may be suited to the first property, but certainly not the second, because each random measure in the lower hierarchies is sampled independently of the others and hence does not facilitate any temporal correlations. To overcome this shortcoming, we propose a new Smoothed Hierarchical Dirichlet Process (sHDP). The key novelty of this model is that we place a temporal constraint amongst the nearby discrete measures $\{G_j\}$ in the form of a symmetric Kullback-Leibler (KL) divergence with a fixed bound $B$. Although the constraint involves only a single scalar value, it nonetheless allows for flexibility in the corresponding successive measures. Remarkably, it also leads us to infer the model within the stick-breaking process, where the traditional Beta distribution used in stick-breaking is replaced by a new constraint calculated from $B$. We present the inference algorithm and elaborate on its solutions. Our experiment using NIPS keywords shows the desirable effect of the model.
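
For context, the traditional stick-breaking construction that the abstract refers to can be sketched as below. This is the standard truncated version with Beta(1, alpha) break fractions, not the sHDP variant: the constraint derived from $B$ is not specified in the abstract and is not reproduced here. The function name, the truncation level, and the concentration parameter `alpha` are assumptions for illustration.

```python
import random


def stick_breaking_weights(alpha, num_sticks, rng):
    """Truncated stick-breaking weights with Beta(1, alpha) break fractions.

    Returns the list of mixture weights and the unbroken remainder of the
    unit-length stick, so that sum(weights) + remainder equals 1.
    """
    weights = []
    remaining = 1.0
    for _ in range(num_sticks):
        fraction = rng.betavariate(1.0, alpha)  # share of what is left
        weights.append(remaining * fraction)
        remaining *= 1.0 - fraction
    return weights, remaining
```

Smaller values of `alpha` tend to put most of the mass on the first few weights, while larger values spread it over many components, which is how the construction accommodates an unknown number of mixtures.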

artificial intelligence, machine learning, symmetric kl divergence, (14 more...)

1604.04741

Genre:

Research Report (0.64)
Instructional Material (0.47)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.82)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.47)
